Main
Alexandre Henrique S. Dias
I have an M.Sc. degree in Electrical and Computer Engineering and a MicroMasters Program Credential in Statistics and Data Science from the Massachusetts Institute of Technology (MIT). Currently, I work as a Data Scientist at QuintoAndar, where I primarily focus on the development of Automated Valuation Models (AVMs).
Industry Experience
Data Scientist
QuintoAndar
São Paulo, SP
Present - 2022
- Develop and improve the company's AVMs, combining data analysis, business understanding, programming, and discussions with stakeholders.
- Technologies: Python, AWS (Amazon SageMaker, S3), PySpark, SQL.
Data Scientist
Americanas S.A.
São Paulo, SP
2022 - 2021
- Developed ML models for the Human Resources Department. These applications covered a broad range of topics, with a primary focus on NLP and HR Analytics. Also created ML pipelines.
- Technologies: Python (Scikit-Learn and TensorFlow), Docker, CI/CD (Kubeflow).
Data Scientist
Looqbox
São Paulo, SP
2021 - 2019
- Collaborated closely with clients to design custom Data Visualizations and Dashboards. Also contributed to the development of Looqbox's proprietary R and Python packages, enriching the toolkit available for data analysis.
- Technologies: R, Python, SQL.
Education
M. Sc. in Electrical and Computer Engineering
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2023 - 2021
- Developed a multilabel classifier for the UN Sustainable Development Goals. Additionally, introduced a novel metric named F-Green, designed to assess models on imbalanced datasets, taking into account not only their performance but also their carbon footprint during training.
- Technologies and Tools: Python (TensorFlow, Scikit-Learn), DVC, GitHub Actions, Weights & Biases.
MITx MicroMasters Program in Statistics and Data Science
MITx on edX
edX
2022 - 2020
- The MITx MicroMasters Program in Statistics and Data Science covers the fundamentals of data science, statistics, and machine learning.
B. Sc., Computer Engineering
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2019 - 2018
- Researcher and member of the Modeling and Scientific Data Analysis team.
B. Sc., Sciences & Technology
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2017 - 2015
- Linear Algebra and Analytical Geometry Teaching Assistant.
- Calculus II Teaching Assistant.
Certificates & Courses
MicroMasters in Statistics and Data Science
MITx on edX
N/A
2022 - 2020
- 6.431x: Probability - The Science of Uncertainty and Data.
- 18.6501x: Fundamentals of Statistics.
- 6.86x: Machine Learning with Python - From Linear Models to Deep Learning.
- 14.310x/Fx: Data Analysis in Social Science.
- DS.CFx: Capstone Exam for Statistics and Data Science.
MLOps (Machine Learning Operations) Fundamentals
Coursera
N/A
2021
DataCamp completed tracks
DataCamp
N/A
2019 - 2018
- Data Scientist with Python.
- Data Analyst with Python.
- Data Manipulation with Python.
- Machine Learning with Python.
- Importing & Cleaning Data with Python.
- Python Programming.
- Python Programmer.
Academic Publications
Paper published in the 2019 II Workshop on Metrology for Industry 4.0 and IoT (MetroInd4.0&IoT). Naples, Italy.
Performance Evaluation of an Edge OBD-II Device for Industry 4.0
Institute of Electrical and Electronics Engineers
IEEE
2019
- Performance evaluation of an Edge OBD-II device that autonomously collects data from vehicles in order to provide customer feedback and tracking.
Research Experience
Undergraduate Researcher
Digital Metropolis Institute
UFRN
2019 - 2018
- Developed a traffic monitoring system using image recognition techniques.
Undergraduate Researcher
Department of Informatics and Applied Mathematics
UFRN
2017 - 2016
- Developed an interactive theorem prover based on Linear Logic using the Maude programming language.
Selected Data Science Writing
I enjoy reading about productivity, lifestyle, data science/AI, and statistics.
Dimensionality Reduction with Factor Analysis on Student Performance Data
N/A
2021
- A dimensionality reduction technique with interpretable outputs.
Stop Using the Elbow Method
N/A
2021
- Silhouette Analysis: A more precise approach to finding the optimal number of clusters using K-Means.
Scikit-Learn 1.0 - A true milestone
N/A
2021
- An overview of the design principles of Scikit-Learn and how the famous ML library became so popular.
The Expectation-Maximization (EM) Algorithm
N/A
2021
- Understanding the motivations and how the EM Algorithm works.
A mathematical derivation of the Law of Total Variance
N/A
2020
- Understanding what the Law of Total Variance is and when to apply it.
Clustering with K-means: simple yet powerful
N/A
2019
- Explains what Cluster Analysis is and how the K-means algorithm works, along with its pros and cons.
An introduction to Linear Regression
N/A
2019
- Explains all the assumptions behind Linear Regression, how to measure its performance, and how to implement it in Python.